Bioinformatics (Thomas Dandekar, Meik Kunz)

https://upload.wikimedia.org/wikipedia/commons/7/70/Aminoacids_table.svg

(determined by BLAST sequence comparison with high sequence similarity to a biochemically verified protein) should then also have a nuclear localization signal (determined, for example, by the ELM server, the “eukaryotic linear motif server”), and the domain composition (determined by the SMART database; Letunic et al. 2015) should confirm the transcription factor through the presence of a DNA-binding domain. After all, everything has to match because every analysis started from the same sequence. Conversely, the different bioinformatics algorithms check and correct each other. In a living cell, the domains in the protein have to fit together correctly.
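The cross-checking idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Evidence` record and the `cross_check` function are hypothetical names, and the three boolean fields stand in for results that would have to be collected by hand or by separate scripts from BLAST, ELM, and SMART. No real server is queried here.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical summary of three independent analyses of ONE sequence."""
    blast_hit_is_verified_tf: bool  # BLAST: high similarity to a biochemically verified transcription factor
    has_nls_motif: bool             # ELM: a nuclear localization signal was found
    has_dna_binding_domain: bool    # SMART: a DNA-binding domain was found


def cross_check(ev: Evidence) -> str:
    """Report whether the three analyses agree about a transcription factor."""
    checks = [ev.blast_hit_is_verified_tf, ev.has_nls_motif, ev.has_dna_binding_domain]
    supporting = sum(checks)  # number of analyses that support the prediction
    if supporting == 3:
        return "consistent: all three analyses support a transcription factor"
    if supporting == 0:
        return "consistent: no analysis supports a transcription factor"
    return f"conflict: only {supporting} of 3 analyses support the prediction"


# Example: BLAST and SMART agree, but no nuclear localization signal was found.
print(cross_check(Evidence(True, False, True)))
```

A conflict does not say which tool is wrong; it only flags that the prediction should be re-examined, which is the sense in which the methods check and correct each other.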

Learning to better understand this genetic “language of life” was, at least for me, a major reason to learn bioinformatics – and the computer is only one, albeit very powerful, tool for this.

Another way to approach this aspect of the language of life is through the proteins themselves. Their richness can be viewed directly with the Pfam database (all protein families; pfam.xfam.org) or UniProt (database of all known proteins and protein sequences; https://www.uniprot.org). This makes it much easier to understand the huge number of different

12 Life Continuously Acquires New Information in Dialogue with the Environment